NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

MGDA Converges under Generalized Smoothness, Provably

Zhang, Qi; Xiao, Peiyao; Zou, Shaofeng; Ji, Kaiyi (April 2025, The Thirteenth International Conference on Learning Representations (ICLR))

Free, publicly-accessible full text available April 28, 2026
MGDA Converges under Generalized Smoothness, Provably

Zhang, Qi; Xiao, Peiyao; Zou, Shaofeng; Ji, Kaiyi (April 2025, ICML)

Free, publicly-accessible full text available April 28, 2026
Tuning-Free Bilevel Optimization: New Algorithms and Convergence Analysis

Yang, Yifan; Ban, Hao; Huang, Minhui; Ma, Shiqian; Ji, Kaiyi (May 2025, International Conference on Learning Representations)

Bilevel optimization has recently attracted considerable attention due to its abundant applications in machine learning problems. However, existing methods rely on prior knowledge of problem parameters to determine stepsizes, resulting in significant effort in tuning stepsizes when these parameters are unknown. In this paper, we propose two novel tuning-free algorithms, D-TFBO and S-TFBO. D-TFBO employs a double-loop structure with stepsizes adaptively adjusted by the "inverse of cumulative gradient norms" strategy. S-TFBO features a simpler fully single-loop structure that updates three variables simultaneously with a theory-motivated joint design of adaptive stepsizes for all variables. We provide a comprehensive convergence analysis for both algorithms and show that D-TFBO and S-TFBO respectively require $$\mathcal{O}(\frac{1}{\epsilon})$$ and $$\mathcal{O}(\frac{1}{\epsilon}\log^4(\frac{1}{\epsilon}))$$ iterations to find an $$\epsilon$$-accurate stationary point, (nearly) matching their well-tuned counterparts using the information of problem parameters. Experiments on various problems show that our methods achieve performance comparable to existing well-tuned approaches, while being more robust to the selection of initial stepsizes. To the best of our knowledge, our methods are the first to completely eliminate the need for stepsize tuning, while achieving theoretical guarantees.
more » « less
Free, publicly-accessible full text available May 1, 2026
First-Order Federated Bilevel Learning

https://doi.org/10.1609/aaai.v39i21.34355

Yang, Yifan; Xiao, Peiyao; Ma, Shiqian; Ji, Kaiyi (April 2025, Proceedings of the AAAI Conference on Artificial Intelligence)

Federated bilevel optimization (FBO) has garnered significant attention lately, driven by its promising applications in meta-learning and hyperparameter optimization. Existing algorithms generally aim to approximate the gradient of the upper-level objective function (hypergradient) in the federated setting. However, because of the nonlinearity of the hypergradient and client drift, they often involve complicated computations. These computations, like multiple optimization sub-loops and second-order derivative evaluations, end up with significant memory consumption and high computational costs. In this paper, we propose a computationally and memory-efficient FBO algorithm named MemFBO. MemFBO features a fully single-loop structure with all involved variables updated simultaneously, and uses only first-order gradient information for all local updates. We show that MemFBO exhibits a linear convergence speedup with milder assumptions in both partial and full client participation scenarios. We further implement MemFBO in a novel FBO application for federated data cleaning. Our experiments, conducted on this application and federated hyper-representation, demonstrate the effectiveness of the proposed algorithm.
more » « less
Free, publicly-accessible full text available April 11, 2026
First-Order Federated Bilevel Learning

Yang, Yifan; Xiao, Peiyao; Ma, Shiqian; Ji, Kaiyi (April 2025, Proceedings of the AAAI Conference on Artificial Intelligence)

Federated bilevel optimization (FBO) has garnered significant attention lately, driven by its promising applications in meta-learning and hyperparameter optimization. Existing algorithms generally aim to approximate the gradient of the upper-level objective function (hypergradient) in the federated setting. However, because of the nonlinearity of the hypergradient and client drift, they often involve complicated computations. These computations, like multiple optimization sub-loops and second-order derivative evaluations, end up with significant memory consumption and high computational costs. In this paper, we propose a computationally and memory-efficient FBO algorithm named MemFBO. MemFBO features a fully single-loop structure with all involved variables updated simultaneously, and uses only first-order gradient information for all local updates. We show that MemFBO exhibits a linear convergence speedup with milder assumptions in both partial and full client participation scenarios. We further implement MemFBO in a novel FBO application for federated data cleaning. Our experiments, conducted on this application and federated hyper-representation, demonstrate the effectiveness of the proposed algorithm.
more » « less
Free, publicly-accessible full text available April 11, 2026
Tuning-Free Bilevel Optimization: New Algorithms and Convergence Analysis

Yang, Yifan; Ban, Hao; Huang, Minhui; Ma, Shiqian; Ji, Kaiyi (April 2025, ICLR)

Free, publicly-accessible full text available April 1, 2026
First-Order Minimax Bilevel Optimization

Yang, Yifan; Si, Zhaofeng; Lyu, Siwei; Ji, Kaiyi (December 2024, Advances in Neural Information Processing Systems)

Multi-block minimax bilevel optimization has been studied recently due to its great potential in multi-task learning, robust machine learning, and few-shot learning. However, due to the complex three-level optimization structure, existing algorithms often suffer from issues such as high computing costs due to the second-order model derivatives or high memory consumption in storing all blocks’ parameters. In this paper, we tackle these challenges by proposing two novel fully first-order algorithms named FOSL and MemCS. FOSL features a fully single-loop structure by updating all three variables simultaneously, and MemCS is a memory-efficient double-loop algorithm with cold-start initialization. We provide a comprehensive convergence analysis for both algorithms under full and partial block participation, and show that their sample complexities match or outperform those of the same type of methods in standard bilevel optimization. We evaluate our methods in two applications: the recently proposed multi-task deep AUC maximization and a novel rank-based robust meta-learning. Our methods consistently improve over existing methods with better performance over various datasets.
more » « less
Full Text Available
Efficiently Escaping Saddle Points in Bilevel Optimization

Huang, Minhui; Chen, Xuxing; Ji, Kaiyi; Ma, Shiqian; Lai, Lifeng (January 2025, Journal of machine learning research)

Bilevel optimization is one of the fundamental problems in machine learning and optimization. Recent theoretical developments in bilevel optimization focus on finding the firstorder stationary points for nonconvex-strongly-convex cases. In this paper, we analyze algorithms that can escape saddle points in nonconvex-strongly-convex bilevel optimization. Specifically, we show that the perturbed approximate implicit differentiation (AID) with a warm start strategy finds an -approximate local minimum of bilevel optimization in ̃O(−2) iterations with high probability. Moreover, we propose an inexact NEgativecurvature-Originated-from-Noise Algorithm (iNEON), an algorithm that can escape saddle point and find local minimum of stochastic bilevel optimization. As a by-product, we provide the first nonasymptotic analysis of perturbed multi-step gradient descent ascent (GDmax) algorithm that converges to local minimax point for minimax problems.
more » « less
Full Text Available
Efficiently Escaping Saddle Points in Bilevel Optimization

Huang, Minhui; Chen, Xuxing; Ji, Kaiyi; Ma, Shiqian; Lai, Lifeng (January 2025, Journal of machine learning research)

Bilevel optimization is one of the fundamental problems in machine learning and optimization. Recent theoretical developments in bilevel optimization focus on finding the first-order stationary points for nonconvex-strongly-convex cases. In this paper, we analyze algorithms that can escape saddle points in nonconvex-strongly-convex bilevel optimization. Specifically, we show that the perturbed approximate implicit differentiation (AID) with a warm start strategy finds an ϵ-approximate local minimum of bilevel optimization in $$\tilde O(\epsilon^{-2})$$ iterations with high probability. Moreover, we propose an inexact NEgative-curvature-Originated-from-Noise Algorithm (iNEON), an algorithm that can escape saddle point and find local minimum of stochastic bilevel optimization. As a by-product, we provide the first nonasymptotic analysis of perturbed multi-step gradient descent ascent (GDmax) algorithm that converges to local minimax point for minimax problems.
more » « less
Full Text Available
AUC-CL: A Batchsize-Robust Framework for Self-Supervised Contrastive Representation Learning

Sharma, Rohan; Ji, Kaiyi; Xu, Zhiqiang; Chen, Changyou (April 2024, International Conference on Learning Representations)

Full Text Available

« Prev Next »

Search for: All records